Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis
Similar resources
Multilevel parametric-base F0 model for speech synthesis
This paper proposes a new F0 model for speech synthesis based on the parameterization of the logF0 contour of the syllables. This parameterization consists of the N-order discrete cosine transform (DCT) plus some additional parameters such as the gradient of the syllable average pitch. A statistical model of the syllable pitch contour is then created by clustering the parameterized vectors wit...
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis grants computers the human ability to produce speech. Data-based and process-based approaches are the two main approaches to speech synthesis, and each has its own challenges. Unit-selection speech synthesis and statistical parametr...
Recurrent Neural Network Postfilters for Statistical Parametric Speech Synthesis
In the last two years, there have been numerous papers that have looked into using Deep Neural Networks to replace the acoustic model in traditional statistical parametric speech synthesis. However, far less attention has been paid to approaches like DNN-based postfiltering where DNNs work in conjunction with traditional acoustic models. In this paper, we investigate the use of Recurrent Neural...
A Hierarchical Encoder-Decoder Model for Statistical Parametric Speech Synthesis
Current approaches to statistical parametric speech synthesis using Neural Networks generally require input at the same temporal resolution as the output, typically a frame every 5 ms, or in some cases at waveform sampling rate. It is therefore necessary to fabricate highly-redundant frame-level (or sample-level) linguistic features at the input. This paper proposes the use of a hierarchical enco...
Temporal modeling in neural network based statistical parametric speech synthesis
This paper proposes a novel neural network structure for speech synthesis, in which spectrum, F0 and duration parameters are simultaneously modeled in a unified framework. In the conventional neural network approaches, spectrum and F0 parameters are predicted by neural networks while phone and/or state durations are given from other external duration predictors. In order to consistently model n...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2018
ISSN: 2329-9290,2329-9304
DOI: 10.1109/taslp.2018.2828650